CLAIMS 

1 1. (currently amended) [[A method comprising the steps of]] The invention of claim 27, 

2 wherein: 

3 the one or more cue codes comprise a plurality of scene parameters; 

4 the E transmitted channels comprise a combined audio signal; 

5 [[(a)]] generating the one or more cue codes and downmixing the C input channels comprises 

6 converting a plurality of the input audio signals into [[a]] the combined audio signal and [[a]] the 

7 plurality of auditory scene parameters; and 

8 [[(b)]] further comprising embedding the auditory scene parameters into the combined audio 

9 signal to generate an embedded audio signal, such that: 

10 a first receiver that is aware of the existence of the embedded auditory scene parameters can 

11 extract the auditory scene parameters from the embedded audio signal and apply the extracted auditory 

12 scene parameters to synthesize an auditory scene; and 

13 a second receiver that is unaware of the existence of the embedded auditory scene parameters 

14 can process the embedded audio signal to generate an output audio signal, where the embedded auditory 

15 scene parameters are transparent to the second receiver. 

1 2. (original) The invention of claim 1, wherein the plurality of auditory scene parameters 

2 comprise two or more different sets of one or more auditory scene parameters, wherein each set of 

3 auditory scene parameters corresponds to a different frequency band in the combined audio signal such 

4 that the first receiver synthesizes the auditory scene by (a) dividing an input audio signal into a plurality 

5 of different frequency bands; and (b) applying the two or more different sets of one or more auditory 

6 scene parameters to two or more of the different frequency bands in the input audio signal to generate 

7 two or more synthesized audio signals of the auditory scene, wherein for each of the two or more 

8 different frequency bands, the corresponding set of one or more auditory scene parameters is applied to 

9 the input audio signal as if the input audio signal corresponded to a single audio source in the auditory 
10 scene. 

1 3. (original) The invention of claim 2, wherein each set of one or more auditory scene 

2 parameters corresponds to a different audio source in the auditory scene. 

1 4. (original) The invention of claim 2, wherein, for at least one of the sets of one or more 

2 auditory scene parameters, at least one of the auditory scene parameters corresponds to a combination of 

3 two or more different audio sources in the auditory scene that takes into account relative dominance of 

4 the two or more different audio sources in the auditory scene. 

1 5. (original) The invention of claim 2, wherein the two or more synthesized audio signals 

2 comprise left and right audio signals of a binaural signal corresponding to the auditory scene. 

1 6. (original) The invention of claim 2, wherein the two or more synthesized audio signals 

2 comprise three or more signals of a multi-channel audio signal corresponding to the auditory scene. 

1 7. (original) The invention of claim 1, wherein the combined audio signal corresponds to a 

2 combination of two or more different mono source signals, wherein the two or more different frequency 

3 bands are selected by comparing magnitudes of the two or more different mono source signals, wherein, 

4 for each of the two or more different frequency bands, one of the mono source signals dominates the one 

5 or more other mono source signals. 

1 8. (original) The invention of claim 1, wherein the combined audio signal corresponds to a 

2 combination of left and right audio signals of a binaural signal, wherein each different set of one or more 
3 auditory scene parameters is generated by comparing the left and right audio signals in a corresponding 

4 frequency band. 

1 9. (original) The invention of claim 1, wherein the auditory scene parameters comprise one 

2 or more of an interaural level difference, an interaural time delay, and a head-related transfer function. 

1 10. (original) The invention of claim 1, wherein step (b) comprises the step of applying a 

2 layered coding technique in which stronger error protection is provided to the combined audio signal 

3 than to the auditory scene parameters when generating the embedded audio signal, such that errors due to 

4 transmission over a lossy channel will tend to affect the auditory scene parameters before affecting the 

5 combined audio signal to improve the probability of the first receiver to process at least the combined 

6 audio signal. 
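
The unequal error protection recited in claim 10 can be illustrated with a minimal Python sketch. It is an illustration only, not the claimed method: a simple repetition code stands in for whatever layered coding technique is actually used, and the names `protect`, `recover`, `embed`, and `extract`, as well as the repeat counts, are hypothetical.

```python
def protect(bits, repeats):
    """Repetition code (assumed stand-in): send each bit `repeats` times."""
    return [b for b in bits for _ in range(repeats)]


def recover(coded, repeats):
    """Majority vote over each group of `repeats` received bits."""
    groups = [coded[i:i + repeats] for i in range(0, len(coded), repeats)]
    return [1 if 2 * sum(g) > len(g) else 0 for g in groups]


def embed(audio_bits, param_bits, audio_repeats=3, param_repeats=1):
    """Strongly protected audio layer followed by a weakly protected parameter layer."""
    return protect(audio_bits, audio_repeats) + protect(param_bits, param_repeats)


def extract(coded, n_audio, audio_repeats=3, param_repeats=1):
    """Split the received bits back into (audio_bits, param_bits)."""
    split = n_audio * audio_repeats
    return (recover(coded[:split], audio_repeats),
            recover(coded[split:], param_repeats))
```

With these assumed settings, a single bit error in the audio layer is corrected by the majority vote, while a single bit error in the parameter layer goes uncorrected, so channel errors degrade the auditory scene parameters before they degrade the combined audio signal.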

1 11. (original) The invention of claim 1, wherein step (b) comprises the step of applying a 

2 multi-descriptive coding technique in which the auditory scene parameters and the combined audio 

3 signal are both divided into two or more streams, wherein each stream divided from the auditory scene 

4 parameters is embedded into a corresponding stream divided from the combined audio signal to form a 

5 stream of the embedded audio signal, such that the two or more streams of the embedded audio signal 

6 may be transmitted over two or more different channels to the first receiver, such that the first receiver is 

7 able to synthesize the auditory scene using extracted auditory scene parameters having relatively coarse 

8 resolution when errors result from transmission of one or more of the streams of the embedded audio 

9 signal over one or more lossy channels. 

1 12. (currently amended) [[A machine-readable medium, having encoded thereon program 

2 code, wherein, when the program code is executed by a machine, the machine implements a method, 

3 comprising the steps of]] The invention of claim 37, wherein: 

4 the one or more cue codes comprise a plurality of scene parameters; 

5 the E transmitted channels comprise a combined audio signal; 

6 [[(a)]] the generating and providing means comprise means for converting a plurality of the input 

7 audio signals into [[a]] the combined audio signal and [[a]] the plurality of auditory scene parameters; 

8 and 

9 [[(b)]] further comprising means for embedding the auditory scene parameters into the combined 

10 audio signal to generate an embedded audio signal, such that: 

11 a first receiver that is aware of the existence of the embedded auditory scene parameters can 

12 extract the auditory scene parameters from the embedded audio signal and apply the extracted auditory 

13 scene parameters to synthesize an auditory scene; and 

14 a second receiver that is unaware of the existence of the embedded auditory scene parameters 

15 can process the embedded audio signal to generate an output audio signal, where the embedded auditory 

16 scene parameters are transparent to the second receiver. 

1 13. (currently amended) [[An apparatus comprising]] The invention of claim 38, wherein: 

2 the one or more cue codes comprise a plurality of scene parameters; 

3 the E transmitted channels comprise a combined audio signal; 

4 [[(a)]] the code estimator and the downmixer are part of an encoder configured to convert a 

5 plurality of the input audio signals into [[a]] the combined audio signal and [[a]] the plurality of auditory 

6 scene parameters; and 

7 [[(b)]] further comprising a merging module configured to embed the auditory scene parameters 

8 into the combined audio signal to generate an embedded audio signal, such that: 



Serial No. 10/045,458 



Baumgarte 1-6-8 (992.1013) 



9 a first receiver that is aware of the existence of the embedded auditory scene parameters can 

10 extract the auditory scene parameters from the embedded audio signal and apply the extracted auditory 

11 scene parameters to synthesize an auditory scene; and 

12 a second receiver that is unaware of the existence of the embedded auditory scene parameters 

13 can process the embedded audio signal to generate an output audio signal, where the embedded auditory 

14 scene parameters are transparent to the second receiver. 

1 14. (currently amended) [[A method for synthesizing an auditory scene]] The invention of 

2 claim 1, further comprising the steps of: 

3 [[(a)]] receiving [[an]] the embedded audio signal comprising [[a]] the combined audio signal 

4 embedded with [[a]] the plurality of auditory scene parameters, wherein a receiver that is unaware of the 

5 existence of the embedded auditory scene parameters can process the embedded audio signal to generate 

6 an output audio signal, where the embedded auditory scene parameters are transparent to the receiver; 

7 [[(b)]] extracting the auditory scene parameters from the embedded audio signal; and 

8 [[(c)]] applying the extracted auditory scene parameters to the combined audio signal to 

9 synthesize an auditory scene. 

10 15. (original) The invention of claim 14, wherein the plurality of auditory scene parameters 

11 comprise two or more different sets of one or more auditory scene parameters, wherein each set of 

12 auditory scene parameters corresponds to a different frequency band in the combined audio signal such 

13 that the auditory scene is synthesized by (1) dividing the combined audio signal into a plurality of 

14 different frequency bands; and (2) applying the two or more different sets of one or more auditory scene 

15 parameters to two or more of the different frequency bands in the combined audio signal to generate two 

16 or more synthesized audio signals of the auditory scene, wherein for each of the two or more different 

17 frequency bands, the corresponding set of one or more auditory scene parameters is applied to the 

18 combined audio signal as if the combined audio signal corresponded to a single audio source in the 

19 auditory scene. 

1 16. (original) The invention of claim 15, wherein each set of one or more auditory scene 

2 parameters corresponds to a different audio source in the auditory scene. 

1 17. (original) The invention of claim 15, wherein, for at least one of the sets of one or more 

2 auditory scene parameters, at least one of the auditory scene parameters corresponds to a combination of 

3 two or more different audio sources in the auditory scene that takes into account relative dominance of 

4 the two or more different audio sources in the auditory scene. 

1 18. (original) The invention of claim 15, wherein the two or more synthesized audio signals 

2 comprise left and right audio signals of a binaural signal corresponding to the auditory scene. 

1 19. (original) The invention of claim 15, wherein the two or more synthesized audio signals 

2 comprise three or more signals of a multi-channel audio signal corresponding to the auditory scene. 

1 20. (original) The invention of claim 14, wherein the combined audio signal corresponds to 

2 a combination of two or more different mono source signals, wherein the two or more different 

3 frequency bands are selected by comparing magnitudes of the two or more different mono source signals, 

4 wherein, for each of the two or more different frequency bands, one of the mono source signals 

5 dominates the one or more other mono source signals. 

1 21. (original) The invention of claim 14, wherein the combined audio signal corresponds to 

2 a combination of left and right audio signals of a binaural signal, wherein each different set of one or 
3 more auditory scene parameters is generated by comparing the left and right audio signals in a 

4 corresponding frequency band. 

1 22. (original) The invention of claim 14, wherein the auditory scene parameters comprise 

2 one or more of an interaural level difference, an interaural time delay, and a head-related transfer 

3 function. 

1 23. (original) The invention of claim 14, wherein the embedded audio signal was generated 

2 by applying a layered coding technique in which stronger error protection was provided to the combined 

3 audio signal than to the auditory scene parameters, such that errors due to transmission over a lossy 

4 channel will tend to affect the auditory scene parameters before affecting the combined audio signal to 

5 improve the probability of a receiver to process at least the combined audio signal. 

1 24. (original) The invention of claim 14, wherein the embedded audio signal was generated 

2 by applying a multi-descriptive coding technique in which the auditory scene parameters and the 

3 combined audio signal were both divided into two or more streams, wherein each stream divided from 

4 the auditory scene parameters was embedded into a corresponding stream divided from the combined 

5 audio signal to form a stream of the embedded audio signal, such that the two or more streams of the 

6 embedded audio signal may be transmitted over two or more different channels to a receiver, such that 

7 the receiver is able to synthesize the auditory scene using extracted auditory scene parameters having 

8 relatively coarse resolution when errors result from transmission of one or more of the streams of the 

9 embedded audio signal over one or more lossy channels. 

1 25. (currently amended) [[A machine-readable medium, having encoded thereon program 

2 code, wherein, when the program code is executed by a machine, the machine implements a method for 

3 synthesizing an auditory scene]] The invention of claim 12, further comprising [[the steps of]]: 

4 [[(a)]] means for receiving [[an]] the embedded audio signal comprising [[a]] the combined 

5 audio signal embedded with [[a]] the plurality of auditory scene parameters, wherein a receiver that is 

6 unaware of the existence of the embedded auditory scene parameters can process the embedded audio 

7 signal to generate an output audio signal, where the embedded auditory scene parameters are transparent 

8 to the receiver; 

9 [[(b)]] means for extracting the auditory scene parameters from the embedded audio signal; and 

10 [[(c)]] means for applying the extracted auditory scene parameters to the combined audio signal 

11 to synthesize an auditory scene. 

1 26. (currently amended) [[An apparatus for synthesizing an auditory scene]] The invention of 

2 claim 13, further comprising: 

3 [[(a)]] a dividing module configured to (1) receive [[an]] the embedded audio signal comprising 

4 [[a]] the combined audio signal embedded with [[a]] the plurality of auditory scene parameters, wherein 

5 a receiver that is unaware of the existence of the embedded auditory scene parameters can process the 

6 embedded audio signal to generate an output audio signal, where the embedded auditory scene 

7 parameters are transparent to the receiver and (2) extract the auditory scene parameters from the 

8 embedded audio signal; and 

9 [[(b)]] a decoder configured to apply the extracted auditory scene parameters to the combined 
10 audio signal to synthesize an auditory scene. 

1 27. (previously presented) A method for encoding C input audio channels to generate E 

2 transmitted audio channels, the method comprising: 

3 providing two or more of the C input channels in a frequency domain; 
4 generating one or more cue codes for each of one or more different frequency bands in the two or 

5 more input channels in the frequency domain; and 

6 downmixing the C input channels to generate the E transmitted channels, where C>E≥1. 
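
The three steps recited in claim 27 can be illustrated with a minimal Python sketch for C=2 and E=1. Everything here is an assumption made for illustration: a naive DFT stands in for the frequency-domain representation, the cue code is an inter-channel level difference in dB relative to the first channel, the downmix is a plain time-domain average, and the names `dft`, `band_power`, `encode_frame`, and `band_edges` are hypothetical.

```python
import cmath
import math


def dft(x):
    """Naive O(N^2) discrete Fourier transform (adequate for a short demo frame)."""
    n = len(x)
    return [sum(x[t] * cmath.exp(-2j * math.pi * k * t / n) for t in range(n))
            for k in range(n)]


def band_power(spectrum, lo, hi):
    """Sum of squared magnitudes over bins lo..hi-1."""
    return sum(abs(spectrum[k]) ** 2 for k in range(lo, hi))


def encode_frame(channels, band_edges):
    """Return (downmix, cues) for one frame of C time-domain channels.

    cues[b][c] is the level of channel c relative to channel 0 in band b, in dB.
    """
    # Step 1: provide the input channels in a frequency domain.
    spectra = [dft(ch) for ch in channels]
    # Step 2: generate one cue code per channel for each frequency band.
    eps = 1e-12  # avoids log(0) in silent bands
    cues = []
    for lo, hi in zip(band_edges[:-1], band_edges[1:]):
        ref = band_power(spectra[0], lo, hi) + eps
        cues.append([10.0 * math.log10((band_power(s, lo, hi) + eps) / ref)
                     for s in spectra])
    # Step 3: downmix the C input channels to E=1 transmitted channel.
    n = len(channels[0])
    downmix = [sum(ch[t] for ch in channels) / len(channels) for t in range(n)]
    return downmix, cues
```

For example, if the second channel is the first channel scaled by 0.5, the cue code for the band containing the signal comes out near -6.02 dB.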

1 28. (previously presented) The invention of claim 27, further comprising formatting the E 

2 transmitted channels and the one or more cue codes into a transmission format such that: 

3 the format enables a first audio decoder having no knowledge of the existence of the one or more 

4 cue codes to generate E playback audio channels based on the E transmitted channels and independent of 

5 the one or more cue codes; and 

6 the format enables a second audio decoder having knowledge of the existence of the one or more 

7 cue codes to generate more than E playback audio channels based on the E transmitted channels and the 

8 one or more cue codes. 
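
One way to picture the transmission format of claim 28 is a chunked container in which a decoder skips chunk types it does not recognize. The Python sketch below is only an assumption about how such a format could look; the chunk layout, the tags `b"AUDI"` and `b"CUES"`, and all function names are hypothetical and are not taken from the claims.

```python
import struct


def pack_chunk(tag, payload):
    """One chunk: 4-byte ASCII tag, 4-byte big-endian length, then the payload."""
    return tag + struct.pack(">I", len(payload)) + payload


def read_chunks(stream):
    """Yield (tag, payload) pairs from a concatenation of chunks."""
    pos = 0
    while pos < len(stream):
        tag = stream[pos:pos + 4]
        (length,) = struct.unpack(">I", stream[pos + 4:pos + 8])
        yield tag, stream[pos + 8:pos + 8 + length]
        pos += 8 + length


def legacy_decode(stream):
    """Decoder with no knowledge of cue codes: keeps audio, skips unknown tags."""
    return b"".join(p for tag, p in read_chunks(stream) if tag == b"AUDI")


def cue_aware_decode(stream):
    """Decoder that knows the cue-code tag: returns (audio bytes, cue bytes)."""
    audio, cues = b"", b""
    for tag, payload in read_chunks(stream):
        if tag == b"AUDI":
            audio += payload
        elif tag == b"CUES":
            cues += payload
    return audio, cues
```

The first decoder recovers the E transmitted channels independent of the cue codes, while the second obtains both and could go on to generate more than E playback channels.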

1 29. (previously presented) The invention of claim 28, wherein the format enables the second 

2 audio decoder to generate C playback audio channels based on the E transmitted channels and the one or 

3 more cue codes. 

1 30. (previously presented) The invention of claim 27, wherein E=1. 

1 31. (previously presented) The invention of claim 27, wherein E>1. 

1 32. (previously presented) The invention of claim 27, wherein each of the E transmitted 

2 channels is based on two or more of the C input channels. 

1 33. (previously presented) The invention of claim 27, wherein the one or more cue codes 

2 comprise one or more of inter-channel level difference (ICLD) data and inter-channel time difference 

3 (ICTD) data. 

1 34. (previously presented) The invention of claim 33, wherein the one or more cue codes 

2 comprise ICLD data and ICTD data. 
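
As a rough illustration of ICTD data, the Python sketch below estimates a time difference between two channels as the lag that maximizes their cross-correlation. This is an assumed estimator chosen for simplicity, not one recited in the claims; to obtain a per-band value it would be applied to band-limited versions of the channels. The name `estimate_ictd` and the `max_lag` parameter are hypothetical.

```python
def estimate_ictd(x, y, max_lag):
    """Return the lag in samples at which y best aligns with x.

    A positive result means y is delayed relative to x.
    """
    best_lag, best_score = 0, float("-inf")
    for lag in range(-max_lag, max_lag + 1):
        # Cross-correlation at this lag, ignoring samples that fall outside y.
        score = sum(x[t] * y[t + lag]
                    for t in range(len(x)) if 0 <= t + lag < len(y))
        if score > best_score:
            best_lag, best_score = lag, score
    return best_lag
```

An ICLD value for the same band would be the corresponding power ratio expressed in dB.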

1 35. (previously presented) The invention of claim 27, wherein the downmixing comprises, 

2 for each of one or more different frequency bands, downmixing the two or more input channels in the 

3 frequency domain into one or more downmixed channels in the frequency domain. 

1 36. (previously presented) The invention of claim 35, wherein the downmixing further 

2 comprises converting the one or more downmixed channels from the frequency domain into one or more 

3 of the transmitted channels in the time domain. 

1 37. (currently amended) [[An audio coder]] Apparatus for encoding C input audio channels to 

2 generate E transmitted audio channels, the [[audio coder]] apparatus comprising: 

3 means for providing two or more of the C input channels in a frequency domain; 

4 means for generating one or more cue codes for each of one or more different frequency bands in 

5 the two or more input channels in the frequency domain; and 

6 means for downmixing the C input channels to generate the E transmitted channels, where 

7 C>E≥1. 

1 38. (previously presented) Apparatus for encoding C input audio channels to generate E 

2 transmitted audio channels, the apparatus comprising: 
3 two or more filter banks adapted to convert two or more of the C input channels from a time 

4 domain into a frequency domain; 

5 a code estimator adapted to generate one or more cue codes for each of one or more different 

6 frequency bands in the two or more converted input channels; and 

7 a downmixer adapted to downmix the C input channels to generate the E transmitted channels, 

8 where C>E≥1. 

1 39. (previously presented) The invention of claim 38, wherein the apparatus is adapted to 

2 format the E transmitted channels and the one or more cue codes into a transmission format such that: 

3 the format enables a first audio decoder having no knowledge of the existence of the one or more 

4 cue codes to generate E playback audio channels based on the E transmitted channels and independent of 

5 the one or more cue codes; and 

6 the format enables a second audio decoder having knowledge of the existence of the one or more 

7 cue codes to generate more than E playback audio channels based on the E transmitted channels and the 

8 one or more cue codes. 

1 40. (previously presented) The invention of claim 39, wherein the format enables the second 

2 audio decoder to generate C playback audio channels based on the E transmitted channels and the one or 

3 more cue codes. 

1 41. (previously presented) The invention of claim 38, wherein E=1. 

1 42. (previously presented) The invention of claim 38, wherein E>1. 

1 43. (previously presented) The invention of claim 38, wherein each of the E transmitted 

2 channels is based on two or more of the C input channels. 

1 44. (previously presented) The invention of claim 38, wherein the one or more cue codes 

2 comprise one or more of ICLD data and ICTD data. 

1 45. (previously presented) The invention of claim 44, wherein the one or more cue codes 

2 comprise ICLD data and ICTD data. 

1 46. (previously presented) The invention of claim 38, wherein the downmixer is adapted, 

2 for each of one or more different frequency bands, to downmix the two or more converted input channels 

3 into one or more downmixed channels in the frequency domain. 

1 47. (previously presented) The invention of claim 46, further comprising one or more 

2 inverse filter banks adapted to convert the one or more downmixed channels from the frequency domain 

3 into one or more of the transmitted channels in the time domain. 

1 48. (previously presented) The invention of claim 38, wherein: 

2 the apparatus is a system selected from the group consisting of a digital video recorder, a digital 

3 audio recorder, a computer, a satellite transmitter, a cable transmitter, a terrestrial broadcast transmitter, 

4 and an entertainment system; and 

5 the system comprises the two or more filter banks, the code estimator, and the downmixer. 

1 49. (previously presented) An encoded audio bitstream generated by encoding C input audio 

2 channels to generate E transmitted audio channels, wherein: 

3 two or more of C input channels are provided in a frequency domain; 






4 one or more cue codes are generated for each of one or more different frequency bands in the 

5 two or more input channels in the frequency domain; 

6 the C input channels are downmixed to generate E transmitted channels, where C>E≥1; and 

7 the E transmitted channels and the one or more cue codes are encoded into the encoded audio 

8 bitstream. 

1 50. (previously presented) The invention of claim 49, wherein the encoded audio bitstream has a 

2 transmission format such that: 

3 the format enables a first audio decoder having no knowledge of the existence of the one or more 

4 cue codes to generate E playback audio channels based on the E transmitted channels and independent of 

5 the one or more cue codes; and 

6 the format enables a second audio decoder having knowledge of the existence of the one or more 

7 cue codes to generate more than E playback audio channels based on the E transmitted channels and the 

8 one or more cue codes. 

1 51. (previously presented) The invention of claim 50, wherein the format enables the second 

2 audio decoder to generate C playback audio channels based on the E transmitted channels and the one or 

3 more cue codes. 

1 52. (previously presented) An encoded audio bitstream comprising E transmitted channels 

2 and one or more cue codes, wherein: 

3 the one or more cue codes are generated by: 

4 providing two or more of C input audio channels in a frequency domain; and 

5 generating one or more cue codes for each of one or more different frequency bands in 

6 the two or more input channels in the frequency domain; and 

7 the E transmitted channels are generated by downmixing the C input channels, where C>E≥1. 

1 53. (previously presented) The invention of claim 52, wherein the encoded audio bitstream has a 

2 transmission format such that: 

3 the format enables a first audio decoder having no knowledge of the existence of the one or more 

4 cue codes to generate E playback audio channels based on the E transmitted channels and independent of 

5 the one or more cue codes; and 

6 the format enables a second audio decoder having knowledge of the existence of the one or more 

7 cue codes to generate more than E playback audio channels based on the E transmitted channels and the 

8 one or more cue codes. 

1 54. (previously presented) The invention of claim 53, wherein the format enables the second 

2 audio decoder to generate C playback audio channels based on the E transmitted channels and the one or 

3 more cue codes. 

1 55. (currently amended) [[A method for decoding E transmitted audio channels to generate C 

2 playback audio channels, the method]] The invention of claim 27, further comprising: 

3 upmixing, for each of one or more different frequency bands, one or more of the E transmitted 

4 channels in a frequency domain to generate two or more of [[the]] C playback channels in the frequency 

5 domain; 

6 applying the one or more cue codes to each of the one or more different frequency bands in the 

7 two or more playback channels in the frequency domain to generate two or more modified channels; and 

8 converting the two or more modified channels from the frequency domain into a time domain. 
1 56. (previously presented) The invention of claim 55, further comprising, prior to upmixing, 

2 converting the one or more of the E transmitted channels from the time domain to the frequency domain. 
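
The decoding steps of claims 55 and 56 can be illustrated with a minimal Python sketch for E=1 and two playback channels. The choices here are assumptions made for illustration: a naive DFT stands in for the filter banks, the cue code is a level difference in dB of the second playback channel relative to the first, the gains are normalized so the two outputs together carry twice the downmix power, and the names `dft`, `decode_frame`, `band_edges`, and `icld_db` are hypothetical. Band edges are assumed not to exceed half the frame length.

```python
import cmath
import math


def dft(x, inverse=False):
    """Naive O(N^2) DFT, or its inverse when inverse=True."""
    n = len(x)
    sign = 1 if inverse else -1
    out = [sum(x[t] * cmath.exp(sign * 2j * math.pi * k * t / n) for t in range(n))
           for k in range(n)]
    return [v / n for v in out] if inverse else out


def decode_frame(downmix, band_edges, icld_db):
    """Upmix one mono frame into two time-domain playback channels."""
    n = len(downmix)
    # Convert the transmitted channel from the time domain to the frequency domain.
    spec = dft(downmix)
    # Upmix: copy the transmitted spectrum into each playback channel.
    left = list(spec)
    right = list(spec)
    # Apply the cue code of each band to produce the modified channels.
    for b, (lo, hi) in enumerate(zip(band_edges[:-1], band_edges[1:])):
        ratio = 10.0 ** (icld_db[b] / 20.0)           # amplitude ratio right/left
        g_left = math.sqrt(2.0 / (1.0 + ratio ** 2))  # so g_left^2 + g_right^2 = 2
        g_right = ratio * g_left
        for k in range(lo, hi):
            for idx in {k, (n - k) % n}:  # scale the mirrored bin so outputs stay real
                left[idx] = spec[idx] * g_left
                right[idx] = spec[idx] * g_right
    # Convert the modified channels from the frequency domain into the time domain.
    return ([v.real for v in dft(left, inverse=True)],
            [v.real for v in dft(right, inverse=True)])
```

Feeding a cue code of about -6.02 dB for the band that contains the signal yields a second playback channel at half the amplitude of the first.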

1 57. (previously presented) The invention of claim 55, wherein E=1. 

1 58. (previously presented) The invention of claim 55, wherein E>1. 

1 59. (previously presented) The invention of claim 55, wherein each of the C playback 

2 channels is based on at least one of the E transmitted channels and at least one cue code. 

1 60. (previously presented) The invention of claim 55, wherein the one or more cue codes 

2 comprise one or more of ICLD data and ICTD data. 

1 61. (previously presented) The invention of claim 60, wherein the one or more cue codes 

2 comprise ICLD data and ICTD data. 

1 62. (previously presented) The invention of claim 55, wherein the upmixing comprises, for 

2 each of one or more different frequency bands, upmixing at least two of the E transmitted channels into 

3 at least one playback channel in the frequency domain. 

1 63. (currently amended) [[An audio decoder for decoding E transmitted audio channels to 

2 generate C playback audio channels, the audio decoder]] The invention of claim 37, further comprising: 

3 means for upmixing, for each of one or more different frequency bands, one or more of the E 

4 transmitted channels in a frequency domain to generate two or more of [[the]] C playback channels in the 

5 frequency domain, where C>E≥1; 

6 means for applying one or more cue codes to each of the one or more different frequency bands 

7 in the two or more playback channels in the frequency domain to generate two or more modified 

8 channels; and 

9 means for converting the two or more modified channels from the frequency domain into a time 
10 domain. 

1 64. (currently amended) [[An apparatus for decoding E transmitted audio channels to 

2 generate C playback audio channels, the apparatus]] The invention of claim 38, further comprising: 

3 an upmixer adapted, for each of one or more different frequency bands, to upmix one or more of 

4 the E transmitted channels in a frequency domain to generate two or more of [[the]] C playback channels 

5 in the frequency domain, where C>E≥1; 

6 a synthesizer adapted to apply one or more cue codes to each of the one or more different 

7 frequency bands in the two or more playback channels in the frequency domain to generate two or more 

8 modified channels; and 

9 one or more inverse filter banks adapted to convert the two or more modified channels from the 
10 frequency domain into a time domain. 

1 65. (previously presented) The invention of claim 64, further comprising one or more filter 

2 banks adapted to convert, prior to the upmixing, the one or more of the E transmitted channels from the 

3 time domain to the frequency domain. 

1 66. (previously presented) The invention of claim 64, wherein E=1. 

1 67. (previously presented) The invention of claim 64, wherein E>1. 

1 68. (previously presented) The invention of claim 64, wherein each of the C playback 

2 channels is based on at least one of the E input channels and at least one cue code. 

1 69. (previously presented) The invention of claim 64, wherein the one or more cue codes 

2 comprise one or more of ICLD data and ICTD data. 

1 70. (previously presented) The invention of claim 69, wherein the one or more cue codes 

2 comprise ICLD data and ICTD data. 

1 71. (previously presented) The invention of claim 64, wherein the upmixer is adapted, for 

2 each of one or more different frequency bands, to upmix at least two of the E transmitted channels into at 

3 least one playback channel in the frequency domain. 

1 72. (previously presented) The invention of claim 64, wherein: 

2 the apparatus is a system selected from the group consisting of a digital video player, a digital 

3 audio player, a computer, a satellite receiver, a cable receiver, a terrestrial broadcast receiver, and an 

4 entertainment system; and 

5 the system comprises the upmixer, the synthesizer, and the one or more inverse filter banks. 